Acknowledgements

This data was gathered by Jake Daniels. It covers data collected on the SEO tag from Medium.com between 2018-01-01 and 2018-12-31. For each article, it includes the title, date of publication, claps generated, author, reading time, and URL. The full text of the articles will be added in the final version.

There are 19,152 articles in this dataset.

Here’s what the data looks like:

| title | date | claps | author | reading_time | url |
|---|---|---|---|---|---|
| How Airbnb is putting AMP at the core of its digital strategy | 2018-01-01 | 969 | Clark Boyd | 8 | https://medium.com/swlh/how-airbnb-is-putting-amp-at-the-core-of-its-digital-strategy-d6b9cf1fc0ad?source=tag_archive---------0--------------------- |
| Overcome the Google Gods: The Best SEO Learning Tools | 2018-02-08 | 951 | The Mission | 3 | https://medium.com/the-mission/overcome-the-google-gods-the-best-seo-learning-tools-d1de6b170e2?source=tag_archive---------0--------------------- |
| How I built a game in a week | 2018-02-12 | 916 | Ada Rose Cannon | 10 | https://medium.com/samsung-internet-dev/how-i-built-a-game-in-a-week-5810b1197686?source=tag_archive---------1--------------------- |
| White Hat SEO, Black Hat Publishing | 2018-12-04 | 904 | Christian A. Dumais | 8 | https://medium.com/elephate/white-hat-seo-black-hat-publishing-114ad843f9d3?source=tag_archive---------1--------------------- |
| The Answer is No | 2018-11-20 | 892 | Christian A. Dumais | 6 | https://medium.com/elephate/the-answer-is-no-6cd54fc663bb?source=tag_archive---------1--------------------- |
| Secrets Of Success From 6 Figure Bloggers | 2018-11-05 | 874 | George J. Ziogas | 14 | https://medium.com/swlh/secrets-of-success-from-6-figure-bloggers-fa9f15cc5950?source=tag_archive---------2--------------------- |
| 91 Experts Share the Most Effective SEO Tips to Drive Traffic to Your Website [Expert Roundup] | 2018-02-14 | 822 | Shane Barker | 23 | https://medium.com/swlh/91-experts-share-the-most-effective-seo-tips-to-drive-traffic-to-your-website-expert-roundup-2c378143b16f?source=tag_archive---------0--------------------- |
| 40+ Best Google Search Tips, Tricks & Hacks for 2018 [Infographic] | 2018-08-06 | 813 | Michael Tomaszewski | 5 | https://medium.com/zety/40-best-google-search-tips-tricks-hacks-for-2018-infographic-d10c909cd2f6?source=tag_archive---------0--------------------- |
| Is My Single-Page App SEO Friendly? | 2018-06-05 | 796 | Anthony Gore | 8 | https://medium.com/js-dojo/is-my-single-page-app-seo-friendly-be2c827f1c38?source=tag_archive---------0--------------------- |
| 911 | 2018-09-12 | 778 | SF Ali | 5 | https://medium.com/@sfali789/911-4bd7f8bc53e4?source=tag_archive---------1--------------------- |
| Wix on Medium: A Fresh Perspective on All Things Internet | 2018-12-17 | 772 | Wix | 3 | https://medium.com/wix-com/welcome-to-wix-on-medium-509d9ead3589?source=tag_archive---------2--------------------- |
| How to avoid the shameful look your site has on Twitter and Facebook | 2018-06-07 | 770 | Ohans Emmanuel | 7 | https://medium.freecodecamp.org/how-to-avoid-the-shaming-look-your-site-has-on-twitter-and-facebook-f2e8f4be568d?source=tag_archive---------0--------------------- |
| 5 marketing tips and tricks for your website in 2018 | 2018-04-24 | 761 | Anna Bazoyan | 3 | https://medium.com/@annabazoyan/5-marketing-tips-and-tricks-for-your-website-in-2018-e5e182742409?source=tag_archive---------9--------------------- |
| Top 500-+ High PR Do follow Backlinks site list | 2018-01-25 | 755 | 270+ Web 2.0 Sites List for SEO high pr backlinks | 4 | https://medium.com/@BestHighPRWeb2.0SitesList/top-500-high-pr-do-follow-backlinks-site-list-8c7633424717?source=tag_archive---------0--------------------- |
| 5 MANTRAS TO BUILD A PROFITABLE ONLINE STORE | 2018-01-05 | 750 | Prabhash Kumar | 4 | https://medium.com/webpreneurs-hub/5-mantras-to-build-a-profitable-online-store-d4684f436e50?source=tag_archive---------3--------------------- |
| Graphs & Paths: PageRank. | 2018-07-27 | 726 | David Pynes | 6 | https://towardsdatascience.com/graphs-and-paths-pagerank-54f180a1aa0a?source=tag_archive---------0--------------------- |
| What is Webpreneurship And What It means for your startups? | 2018-03-26 | 706 | Prabhash Kumar | 5 | https://medium.com/webpreneurs-hub/what-is-webpreneurship-and-what-it-means-for-your-startups-a9b4caa0f5d7?source=tag_archive---------12--------------------- |
| Harvard Medical School. | 2018-08-24 | 704 | SF Ali | NA | https://hackernoon.com/harvard-medical-school-339530ef159d?source=tag_archive---------2--------------------- |
| How Bot Mitigation can speed up your website! | 2018-07-09 | 702 | John Witter | 3 | https://medium.com/variti/how-bot-mitigation-can-speed-up-your-website-1f8bad13955e?source=tag_archive---------11--------------------- |
| The Shape of Things to Come | 2018-12-11 | 701 | Christian A. Dumais | 8 | https://medium.com/elephate/the-shape-of-things-to-come-a46682d1f248?source=tag_archive---------0--------------------- |
| 6 SEO Experiments That Will Blow Your Mind | 2018-06-04 | 696 | Larry Kim | 9 | https://medium.com/marketing-and-entrepreneurship/6-seo-experiments-that-will-blow-your-mind-513e84410636?source=tag_archive---------0--------------------- |
| Google Doesn’t Have the Guts to Make Page Speed Actually Matter | 2018-07-13 | 684 | Dan Fabulich | 7 | https://redfin.engineering/google-doesnt-have-the-guts-to-make-page-speed-actually-matter-ab2a1a8fe496?source=tag_archive---------1--------------------- |
| How To Build Your Content Marketing Blog | 2018-12-29 | 679 | Johann Sigmund | 15 | https://medium.com/@johann.sigmund/how-to-build-your-content-marketing-blog-6cf3f6c357f2?source=tag_archive---------1--------------------- |
| 自然流量新來源?SEO 經營新方向?手機版 Chrome 的文章推薦! | 2018-05-24 | 662 | 侯智薰(Raymond CH Hou) | 6 | https://medium.com/@raymondhou/chrome-seo-article-for-you-8a458dda4cba?source=tag_archive---------0--------------------- |
| Efficiently snapshotting your single-page-apps with Puppeteer | 2018-02-27 | 631 | Chang Wang | 4 | https://hackernoon.com/efficiently-snapshotting-spas-with-puppeteer-c4c77aa2831b?source=tag_archive---------0--------------------- |

Time-series

This is what the article volume looks like over time. This will be scaled beyond a one-year outlook in the final reports.

Post Volume

Aggregated by week. Look for seasonality.
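
As a rough illustration of the weekly aggregation, here is a minimal sketch that counts posts per week. Starting each week on Monday and the function names `week_start` and `weekly_volume` are assumptions made for this example; the dates are taken from the sample rows above.

```python
from collections import Counter
from datetime import date, timedelta


def week_start(day: date) -> date:
    """Return the Monday of the week that contains `day`."""
    return day - timedelta(days=day.weekday())


def weekly_volume(dates):
    """Count posts per week, keyed by the Monday that starts each week."""
    counts = Counter(week_start(date.fromisoformat(d)) for d in dates)
    return dict(sorted(counts.items()))


# Publication dates from a few of the sample rows above.
sample_dates = ["2018-01-01", "2018-01-05", "2018-02-08", "2018-02-12", "2018-02-14"]
volume = weekly_volume(sample_dates)

for monday, n_posts in volume.items():
    print(monday, n_posts)
```

With a full year of data, repeating peaks and dips in these weekly counts are where seasonality would show up.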

Weekly Topics

We can find the most relevant word of each week by using term frequency-inverse document frequency, or TF-IDF.

Let’s take those keywords a step further by looking at the phrases that also occurred that week and seeing how they relate to the keyword.
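
To make the idea concrete, here is a small sketch of TF-IDF in which each week is treated as one document made of all its headlines. That framing, the function name `top_tfidf_word`, and the toy headlines are assumptions for illustration, not the exact pipeline behind this report.

```python
import math
from collections import Counter


def top_tfidf_word(weekly_titles):
    """Return {week: (word, score)} for the highest TF-IDF word of each week.

    Each week is treated as a single document made of all its headlines.
    """
    term_counts = {
        week: Counter(word for title in titles for word in title.lower().split())
        for week, titles in weekly_titles.items()
    }
    n_weeks = len(term_counts)
    # Document frequency: the number of weeks in which each word appears.
    doc_freq = Counter(word for counts in term_counts.values() for word in counts)

    best = {}
    for week, counts in term_counts.items():
        total = sum(counts.values())
        scores = {
            word: (count / total) * math.log(n_weeks / doc_freq[word])
            for word, count in counts.items()
        }
        best[week] = max(scores.items(), key=lambda item: item[1])
    return best


# Toy headlines (hypothetical, not taken from the dataset).
weekly_titles = {
    "week-1": ["seo tips for blogs", "seo tools for blogs"],
    "week-2": ["seo tips for amp", "amp speed tips"],
}
top_words = top_tfidf_word(weekly_titles)
print(top_words)
```

Words that show up every week, like "seo" here, score zero because their inverse document frequency is log(1). That is why TF-IDF surfaces what is distinctive about a week rather than what is merely common.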

Below is a table of the three most relevant phrases along with their keyword.

You can sort by the highest clap averages and see which keyword and phrases contributed to that week being popular.

You can examine the geometric mean of the claps that were generated (a measure of success) and the volume of posts (to check for an adequate sample size).

Frequent Terms

Here are words and phrases that are most used in headlines. The phrases have been stemmed to best gather their relations.
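
Here is a minimal sketch of how word and phrase counts like these could be produced. The stop-word list is a tiny illustrative stand-in, the function names are made up for this example, and stemming is left out to keep the sketch dependency-free (a real run would stem the tokens first, as described above).

```python
from collections import Counter

# Illustrative stop words only; a real list would be much longer.
STOP_WORDS = {"a", "the", "to", "of", "for", "in", "and", "is", "your", "on"}


def tokenize(title):
    """Lowercase a headline, strip surrounding punctuation, drop stop words."""
    words = (w.strip(".,:;!?()[]\"'") for w in title.lower().split())
    return [w for w in words if w and w not in STOP_WORDS]


def term_counts(titles, n=1):
    """Count n-word phrases (n=1 for single words, n=2 for two-word phrases)."""
    counts = Counter()
    for title in titles:
        tokens = tokenize(title)
        counts.update(" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    return counts


# Headlines from the sample rows above.
titles = [
    "Overcome the Google Gods: The Best SEO Learning Tools",
    "6 SEO Experiments That Will Blow Your Mind",
    "White Hat SEO, Black Hat Publishing",
]
words = term_counts(titles, n=1)
phrases = term_counts(titles, n=2)
print(words.most_common(3))
print(phrases.most_common(3))
```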

Word and phrase counts give a good signal of what’s being discussed. However, we want to use data science to look further into the effectiveness of these words.

Correlated Words

We’ll examine networks of correlated words and show clusters that are most effective.

Pairwise Correlations

First, let’s pick some popular terms from above and examine their relationships with the words that appear before or after them.
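
A simple way to sketch this is to count which words sit directly before and after a chosen term. The function name `neighbours` and the whitespace tokenizer are assumptions for illustration; the headlines come from the sample rows above.

```python
from collections import Counter


def neighbours(titles, term):
    """Count the words that appear directly before or after `term`."""
    before, after = Counter(), Counter()
    for title in titles:
        tokens = [w.strip(".,:;!?") for w in title.lower().split()]
        for i, token in enumerate(tokens):
            if token != term:
                continue
            if i > 0:
                before[tokens[i - 1]] += 1
            if i + 1 < len(tokens):
                after[tokens[i + 1]] += 1
    return before, after


titles = [
    "Overcome the Google Gods: The Best SEO Learning Tools",
    "6 SEO Experiments That Will Blow Your Mind",
    "Is My Single-Page App SEO Friendly?",
]
before, after = neighbours(titles, "seo")
print("before:", dict(before))
print("after:", dict(after))
```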

For something different, let’s look at positive verbs instead of terms. This gives more of a feel for what is desired in the industry.

Examining verbs instead of terms reveals a different set of trends. Negative words like “remove” or “avoid” are worth exploring too.

Word Networks

We can create networks of these relationships based on how often these words occur beside each other.

Below is a network of the correlated words in the article headers. Each grouping represents a topic.
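
For readers who want to see the mechanics, here is a rough sketch of one way to build such groupings: measure how strongly two words co-occur across headlines with the phi coefficient, keep the pairs above a threshold as edges, and read each connected group of words as a topic. The choice of phi, the 0.6 threshold, the function names, and the toy headlines are all assumptions for illustration, not necessarily what produced the chart.

```python
import math
from itertools import combinations


def phi(token_sets, a, b):
    """Phi coefficient between the presence of words a and b across headlines."""
    n11 = n10 = n01 = n00 = 0
    for tokens in token_sets:
        has_a, has_b = a in tokens, b in tokens
        if has_a and has_b:
            n11 += 1
        elif has_a:
            n10 += 1
        elif has_b:
            n01 += 1
        else:
            n00 += 1
    denom = math.sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    return (n11 * n00 - n10 * n01) / denom if denom else 0.0


def word_groups(titles, threshold=0.6):
    """Link words whose phi >= threshold and return the connected groups."""
    token_sets = [set(t.lower().split()) for t in titles]
    vocab = sorted(set().union(*token_sets))
    adjacency = {w: set() for w in vocab}
    for a, b in combinations(vocab, 2):
        if phi(token_sets, a, b) >= threshold:
            adjacency[a].add(b)
            adjacency[b].add(a)

    groups, seen = [], set()
    for word in vocab:
        if word in seen or not adjacency[word]:
            continue  # already grouped, or not linked to anything
        stack, group = [word], set()
        while stack:  # depth-first walk over linked words
            current = stack.pop()
            if current in group:
                continue
            group.add(current)
            stack.extend(adjacency[current] - group)
        seen |= group
        groups.append(sorted(group))
    return groups


# Toy headlines (hypothetical, not taken from the dataset).
titles = ["page speed tips", "page speed audit", "link building guide", "link building tips"]
groups = word_groups(titles)
print(groups)
```

Lowering the threshold pulls more loosely related words into each group, which is one knob for making the networks denser or sparser.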

If we add another dimension, then we can see which of the networks are most effective for generating claps.

How about another dimension? The size of the circles now reflects the volume of that word.

Need help reading the final chart?

Positive Trends
* Red is good, especially when it’s a larger node
* Networks with red in them represent topics that are popular
* Small red nodes will represent under-utilized topics

Negative Trends
* Blue is ineffective, especially when it’s a larger node
* Blue nodes MAY have topics that have yet to be packaged correctly
* White is neutral; these words/topics are performing at an average rate

Each connected node is part of a topic. We depend on the colours to distinguish which topics are good at generating claps and which are not.

Topic Clusters

The networks above show relationships between words that create topics.

We’ll try to find topics in the data by looking for clusters of words. We use unsupervised machine learning to do this: we give the computer no desired outcome to find, so it just digs for patterns that naturally occur in the dataset.

It’s a simple way to get a feel for big trends in the data and what’s currently underway in the industry.
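
To give a feel for what "unsupervised" means here, below is a small sketch that groups headlines with k-means on bag-of-words vectors and then lists the heaviest words in each cluster. K-means is a stand-in chosen for brevity, not necessarily the method used in this report; the fixed starting points, function names, and toy headlines are likewise assumptions for illustration.

```python
def vectorize(titles):
    """Turn headlines into 0/1 vectors over a sorted vocabulary."""
    token_sets = [set(t.lower().split()) for t in titles]
    vocab = sorted(set().union(*token_sets))
    vectors = [[1.0 if w in tokens else 0.0 for w in vocab] for tokens in token_sets]
    return vocab, vectors


def sq_dist(u, v):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def kmeans(vectors, init_idx, iterations=10):
    """Plain k-means; centroids start at the vectors listed in init_idx."""
    centroids = [vectors[i][:] for i in init_idx]
    labels = []
    for _ in range(iterations):
        # Assign every headline to its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda c: sq_dist(v, centroids[c]))
            for v in vectors
        ]
        # Move each centroid to the mean of its members.
        for c in range(len(centroids)):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


def top_words(vocab, centroid, n):
    """The n words with the largest weight in a cluster centroid."""
    ranked = sorted(zip(vocab, centroid), key=lambda pair: (-pair[1], pair[0]))
    return [word for word, _ in ranked[:n]]


# Toy headlines (hypothetical, not taken from the dataset).
titles = ["page speed tips", "page speed audit", "link building guide", "link building tips"]
vocab, vectors = vectorize(titles)
labels, centroids = kmeans(vectors, init_idx=(0, 2))
for centroid in centroids:
    print(top_words(vocab, centroid, n=2))
```

The two knobs in this sketch, the number of starting points and `n` in `top_words`, correspond to the number of clusters and the number of words shown per cluster.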

Here are 5 clusters, each described by 8 words:

This creates our topic clusters! They’re great for brainstorming content and knowing what’s commonly talked about. Let’s reduce the number of words, increase the number of clusters, and see if new ideas emerge.

Tweaking the numbers can form different topics.

These topics can typically be inferred. It’s not too hard to figure out what each cluster represents.

Word Impact

Here are words that are impactful or overused, alongside words that are proven to bring claps.

The size is based on another measurement called geometric mean. It is often used when data is highly skewed. It can be a tie-breaker for clusters that are close together.
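
Here is a quick sketch of why the geometric mean suits skewed clap counts. Adding 1 before taking logs (so that zero-clap posts don't break the calculation) is an assumption of this example, not necessarily how the figures in this report were computed; apart from 969, the top clap count in the sample rows, the values are made up.

```python
import math


def geometric_mean_claps(claps):
    """Geometric mean of claps, shifted by 1 so zero-clap posts are allowed."""
    if not claps:
        raise ValueError("claps must not be empty")
    mean_log = sum(math.log1p(c) for c in claps) / len(claps)
    return math.expm1(mean_log)


# Illustrative clap counts with one runaway hit.
claps = [0, 1, 3, 7, 969]
arithmetic = sum(claps) / len(claps)
geometric = geometric_mean_claps(claps)

print(f"arithmetic mean: {arithmetic:.1f}")
print(f"geometric mean:  {geometric:.2f}")
```

The single viral post drags the arithmetic mean far above what a typical post earns, while the geometric mean stays close to the bulk of the data.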

We can also make these charts interactive, so anyone can inspect the points.

And here’s a table of those terms. As in our chart above, more green adds credibility to the geometric mean (pink) being accurate.

Top 25 Terms
Performance
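| Word | Geometric Average | Occurrences |
|---|---|---|
| content | 1.66 | 773 |
| write | 1.59 | 265 |
| blog | 1.12 | 641 |
| googl | 1.09 | 1337 |
| tool | 1.09 | 441 |
| start | 1.07 | 184 |
| guid | 1.05 | 434 |
| creat | 0.94 | 230 |
| post | 0.91 | 206 |
| organ | 0.88 | 210 |
| traffic | 0.87 | 613 |
| site | 0.84 | 572 |
| step | 0.84 | 318 |
| increas | 0.78 | 321 |
| search | 0.78 | 1350 |
| rank | 0.77 | 795 |
| build | 0.76 | 417 |
| trend | 0.76 | 207 |
| boost | 0.75 | 297 |
| keyword | 0.75 | 408 |
| reason | 0.75 | 304 |
| strategi | 0.73 | 640 |
| wordpress | 0.73 | 324 |
| link | 0.72 | 398 |
| page | 0.72 | 553 |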

Viral Words

Let’s separate popular articles from average ones and see if there’s a difference in word usage. We’ll compare articles above the 95th percentile of claps per article with those between the 25th and 75th percentiles and examine their differences.

These are pretty average results when we have such a large sample size. So let’s put both viral and average phrases on the same graphs and look at the difference in proportion.
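
The comparison can be sketched as follows: split the posts by clap percentiles, compute each word's share of all words within each group, and subtract. The function names, the choice of `method="inclusive"`, and the toy posts are assumptions for illustration only.

```python
import statistics
from collections import Counter


def split_by_claps(posts):
    """Split (title, claps) pairs into viral and average headline lists.

    Viral: claps above the 95th percentile.
    Average: claps between the 25th and 75th percentiles (inclusive).
    """
    claps = [c for _, c in posts]
    cuts = statistics.quantiles(claps, n=100, method="inclusive")  # 99 cut points
    p25, p75, p95 = cuts[24], cuts[74], cuts[94]
    viral = [title for title, c in posts if c > p95]
    average = [title for title, c in posts if p25 <= c <= p75]
    return viral, average


def word_share(titles):
    """Each word's share of all words used in the given headlines."""
    counts = Counter(w for t in titles for w in t.lower().split())
    total = sum(counts.values())
    return {w: n / total for w, n in counts.items()}


def share_difference(viral, average):
    """Positive values: the word is relatively more common in viral posts."""
    v, a = word_share(viral), word_share(average)
    return {w: v.get(w, 0.0) - a.get(w, 0.0) for w in set(v) | set(a)}


# Toy posts (hypothetical): twenty ordinary posts and one runaway hit.
posts = [("seo tips", claps) for claps in range(20)] + [("growth hack", 20)]
viral, average = split_by_claps(posts)
diff = share_difference(viral, average)
print(sorted(diff.items(), key=lambda item: -item[1]))
```

Working with shares rather than raw counts matters here because the average group is much larger than the viral group, so raw counts would always favour it.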

Word Choice in Viral Posts vs Avg. Posts:

Take a look at the viral phrases.

Thanks for looking at what’s available right now. There’s plenty more in my notebook that isn’t ready yet.

At this point, I’ll be holding the report for feedback from interested parties on which insights they expect.

Thanks for looking!

Analysis by: Jake Daniels